Closed
Bug 1136762
Opened 10 years ago
Closed 9 years ago
TSan: data race xpcom/io/nsPipe3.cpp:1061 CloseWithStatus
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
mozilla47
Tracking | Status | |
---|---|---|
firefox37 | --- | unaffected |
firefox38 | --- | affected |
firefox39 | --- | affected |
firefox47 | --- | fixed |
People
(Reporter: froydnj, Assigned: jseward)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(2 files, 1 obsolete file)
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).
* Specific information about this bug
It looks like we're trying to close the input stream of the pipe from two different threads? And the second close is racing with the first insofar as it's checking a non-synchronized value.
I would be really curious to know why we're sharing the same stream here between two different threads, especially since the main thread use looks like it's coming from JS...maybe this is the HTTP server used for mochitests.
* General information about TSan, data races, etc.
Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2].
If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.
[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
![]() |
Reporter | |
Comment 1•10 years ago
|
||
You can use all the Bugzilla template helpers you want, but it would be *super* useful if the template yelled at you when you forgot to fill out a field.
Comment 2•10 years ago
|
||
I won't be able to look at this until next week.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
status-firefox37:
--- → unaffected
status-firefox38:
--- → affected
status-firefox39:
--- → affected
Comment 3•10 years ago
|
||
This is most likely happening due to:
1) Thread 1 is STS writing to nsPipeOutputStream with NS_AsyncCopy() using the auto-close feature.
2) Thread 2 is reading from an nsPipeInputStream and is closing because it hit an exception at the same time. Probably because the pipe was closed from the writing side.
In this case its the nsPipeInputStream::mInputStatus field thats being accessed in two places. It might be as easy as marking this Atomic<nsresult>, but I need to look at the race more closely.
Comment 4•10 years ago
|
||
Hmm, no. I need to lock the monitor anywhere I touch the pipe mStatus or input stream's mInputStatus. I should add a lot of locking assertions around these methods as well.
Assignee | ||
Comment 5•9 years ago
|
||
Ben, is there any possibility to move this along? Once bug 1141565 lands,
this will be the #1 reported race when running the xpcshell test suite:
123 xpcom/io/nsPipe3.cpp:1205:7 in CloseWithStatus <--------------
116 nsprpub/pr/src/io/prlog.c:358:20 in PR_NewLogModule
116 netwerk/base/nsIOService.cpp:1448:19 in nsIOService::Observe(nsISupp
43 modules/libpref/Preferences.cpp:1769:38 in mozilla::UintVarChanged(c
38 ipc/glue/MessagePump.cpp:142:7 in mozilla::ipc::MessagePump::Schedul
11 js/src/jsscript.cpp:2529:58 in MarkScriptData
9 tools/profiler/core/platform-linux.cc:250:46 in (anonymous namespace
9 tools/profiler/core/platform-linux.cc:249:46 in (anonymous namespace
9 tools/profiler/core/platform.h:350:34 in IsActive
Assignee | ||
Comment 6•9 years ago
|
||
Easy STR: do a TSan-enabled build. Then:
TSAN_OPTIONS=suppressions=/home/sewardj/MOZ/SUPPS/tsan-empty.txt \
DISPLAY=:1.0 ./mach xpcshell-test --sequential --log-mach - \
browser/experiments/test/xpcshell/test_activate.js
Comment 7•9 years ago
|
||
Nathan, can you take this? I'm swamped with service worker stuff still.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 9•9 years ago
|
||
I audited all uses of mStatus and mInputStatus and found three places
where the monitor should have been held, but wasn't. I also found a
race on mOriginalInput in nsPipe::Release. The attached patch fixes
all of these.
Assignee | ||
Updated•9 years ago
|
Attachment #8715253 -
Flags: feedback?(bkelly)
Comment 10•9 years ago
|
||
Comment on attachment 8715253 [details] [diff] [review]
bug1136762-1.cset
Review of attachment 8715253 [details] [diff] [review]:
-----------------------------------------------------------------
Looks reasonable. Thanks!
Attachment #8715253 -
Flags: feedback?(bkelly) → feedback+
Assignee | ||
Updated•9 years ago
|
Attachment #8715253 -
Flags: review?(nfroyd)
Updated•9 years ago
|
Assignee: bkelly → jseward
![]() |
Reporter | |
Comment 11•9 years ago
|
||
Comment on attachment 8715253 [details] [diff] [review]
bug1136762-1.cset
Review of attachment 8715253 [details] [diff] [review]:
-----------------------------------------------------------------
This needs a few changes first.
::: xpcom/io/nsPipe3.cpp
@@ +525,5 @@
> return 0;
> }
> + // Avoid racing on |mOriginalInput| by only looking at it when
> + // the refcount is 1, that is, we are the only pointer (hence only
> + // thread) to access it.
Thank you for this comment.
@@ +1179,5 @@
> {
> LOG(("nsPipeInputStream::OnInputException [this=%x reason=%x]\n",
> this, aReason));
>
> + ReentrantMonitorAutoEnter mon(mPipe->mReentrantMonitor);
Why is this necessary? AFAICT, we only ever call this with the monitor held. (And so we ought to change it so it takes a ReentrantMonitorAutoEnter&...)
@@ +1490,5 @@
>
> nsresult
> nsPipeInputStream::Status() const
> {
> + ReentrantMonitorAutoEnter mon(mPipe->mReentrantMonitor);
Please make a separate Status(const ReentrantMonitorAutoEnter&) method, move the |return NS_FAILED(...)| bit into that method, and use the new method wherever possible. (Almost every call to Status() can use the new method, AFAICT.) There's no reason to require unnecessary monitor acquisition for something so trivial as status flags.
Attachment #8715253 -
Flags: review?(nfroyd) → review-
Assignee | ||
Comment 12•9 years ago
|
||
Addresses all review comments in comment 11, and also ..
> @@ +1179,5 @@
> > {
> > LOG(("nsPipeInputStream::OnInputException [this=%x reason=%x]\n",
> > this, aReason));
> >
> > + ReentrantMonitorAutoEnter mon(mPipe->mReentrantMonitor);
>
> Why is this necessary? AFAICT, we only ever call this with the monitor held. > (And so we ought to change it so it takes a ReentrantMonitorAutoEnter&...)
.. it makes this method (OnInputException) take a
const ReentrantMonitorAutoEnter&...)
as "evidence", and also does the same for OnInputReadable.
Attachment #8715253 -
Attachment is obsolete: true
Assignee | ||
Updated•9 years ago
|
Attachment #8718869 -
Flags: feedback?(nfroyd)
![]() |
Reporter | |
Comment 13•9 years ago
|
||
Comment on attachment 8718869 [details] [diff] [review]
bug1136762-2.cset
Review of attachment 8718869 [details] [diff] [review]:
-----------------------------------------------------------------
WFM, thank you for making the broader fiddly changes. Consider this an r+.
Attachment #8718869 -
Flags: review+
Attachment #8718869 -
Flags: feedback?(nfroyd)
Attachment #8718869 -
Flags: feedback+
Comment 14•9 years ago
|
||
Comment 15•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in
before you can comment on or make changes to this bug.
Description
•